Asynchronous Parallel Computing Algorithm implemented in 1D Heat Equation with CUDA
Authors
Abstract
In this note, we present a stability and performance analysis of an asynchronous parallel computing algorithm applied to the 1D heat equation and implemented in CUDA. The primary objective of this note is to disseminate the asynchronous parallel computing algorithm by providing CUDA code for fast and easy implementation. We show that simulations carried out on an NVIDIA GPU with the asynchronous scheme outperform the synchronous parallel computing algorithm. We also discuss some drawbacks of asynchronous parallel computing algorithms.
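As a rough illustration of the idea only, and not the CUDA code the note provides, the sketch below emulates an asynchronous scheme serially in plain Python. It assumes an explicit forward-time, centred-space (FTCS) update with r = alpha*dt/dx^2, zero boundary values, and two subdomains standing in for processing elements. The names `ftcs_step`, `solve`, and `max_delay` are hypothetical. Asynchrony is mimicked by refreshing the values exchanged between the subdomains only every `max_delay` steps, so each subdomain may compute with stale neighbour data.

```python
import math

def ftcs_step(u, r, left, right):
    """One explicit FTCS update of a subdomain, given its two halo values."""
    padded = [left] + u + [right]
    return [padded[i] + r * (padded[i + 1] - 2.0 * padded[i] + padded[i - 1])
            for i in range(1, len(padded) - 1)]

def solve(n=32, steps=200, r=0.25, max_delay=1):
    """Emulate two processing elements on n interior grid points.

    Halo values are refreshed only every max_delay steps (assumption:
    a fixed delay stands in for the unpredictable lag of a real
    asynchronous run). max_delay=1 is the synchronous scheme.
    """
    # Interior points x_i = i/(n+1), i = 1..n; u = 0 at both ends.
    u = [math.sin(math.pi * (i + 1) / (n + 1)) for i in range(n)]
    half = n // 2
    a, b = u[:half], u[half:]
    halo_a, halo_b = b[0], a[-1]  # neighbour values as seen by a and b
    for step in range(steps):
        if step % max_delay == 0:
            halo_a, halo_b = b[0], a[-1]  # "communication" happens here
        a, b = ftcs_step(a, r, 0.0, halo_a), ftcs_step(b, r, halo_b, 0.0)
    return a + b

if __name__ == "__main__":
    sync = solve(max_delay=1)
    lagged = solve(max_delay=4)
    print(max(abs(p - q) for p, q in zip(sync, lagged)))
```

With `max_delay=1` the emulation reduces to the ordinary synchronous FTCS scheme. With r <= 1/2 every update is a weighted average of current and stale values, so the lagged solution stays bounded, but it departs from the synchronous one near the subdomain interface. This is a simple instance of the accuracy cost that an asynchronous scheme can incur.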
Similar resources
Parallel hybrid PSO with CUDA for 1D heat conduction equation
Objectives: We propose a parallel hybrid particle swarm optimization (PHPSO) algorithm to reduce the computational cost of solving the one-dimensional (1D) heat conduction equation, which is expensive and poses a great challenge to both common hardware and software. Background: Over the past few years, GPUs have quickly emerged as inexpensive parallel processors ...
Asynchronous Parallel Computing Model of Global Motion Estimation with CUDA
For video coding, to balance coding rate against image quality, we apply a global motion search algorithm to avoid loss of image quality and use the parallel computing capacity of graphics processors to accelerate the encoding process. Based on the heterogeneous CPU+GPU system and the multi-threaded parallel structure and thread synchronization features of the CUDA platform, we build a pro...
Performance Comparison of Asynchronous Transfer Configurations for UHD Game Image Compression with GPGPU
Ultra high definition (UHD) game scenes cause a memory bandwidth problem. The lossless DPCM-GR based compression algorithm [12], which uses general-purpose GPU (GPGPU) computing such as NVIDIA CUDA (Compute Unified Device Architecture), relieves the bandwidth problem without sacrificing image quality and supports bit-parallel pipelining. This paper increases the memory bandwidth efficiency usin...
Weighted Block-Asynchronous Relaxation for GPU-Accelerated Systems
In this paper, we analyze the potential of using weights for block-asynchronous relaxation methods on GPUs. For this purpose, we introduce different weighting techniques similar to those applied in block-smoothers for multigrid methods. Having proven a sufficient convergence condition for the weighted block-asynchronous iteration, we analyze the performance of the algorithms implemented using C...
DiamondTorre Algorithm for High-Performance Wave Modeling
Effective algorithms for solving numerical modeling problems of physical media are discussed. The computation rate of such problems is limited by memory bandwidth if implemented with traditional algorithms. The numerical solution of the wave equation is considered. A finite difference scheme with a cross stencil and a high order of approximation is used. The DiamondTorre algorithm is constructed,...